Stochastic mixed model sequencing with multiple stations using reinforcement learning and probability quantiles


Abstract

In this study, we propose a reinforcement learning (RL) approach for minimizing the number of work overload situations in the mixed-model sequencing (MMS) problem with stochastic processing times. The environment simulates the processing times and penalizes work overloads with negative rewards. To account for the stochastic component of the problem, we implement a state representation that specifies whether a work overload will occur if the processing times are equal to their respective 25%, 50%, and 75% probability quantiles. Thereby, the RL agent is guided toward good sequences while being provided statistical information about how processing-time fluctuations affect solution quality. To the best of our knowledge, this study is the first to consider processing-time variation in the minimization of work overload situations.
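The quantile-based state features described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the simple "worker crosses the station border" overload condition, and all parameter names are assumptions made for the example.

```python
import numpy as np

def overload_flags(worker_position, station_length, time_samples,
                   quantiles=(0.25, 0.50, 0.75)):
    """Binary state features: would a work overload occur if the stochastic
    processing time realized exactly at each probability quantile?

    Hypothetical helper illustrating the paper's quantile idea; the overload
    condition (worker exceeds the station border) is a simplification.
    """
    q_times = np.quantile(time_samples, quantiles)
    # Flag 1 if the worker would be pushed past the station border.
    return [int(worker_position + t > station_length) for t in q_times]

# Example: worker at position 2.0 in a station of length 5.0, with the
# processing time approximated by samples from a normal distribution.
rng = np.random.default_rng(0)
samples = rng.normal(loc=2.5, scale=1.0, size=10_000)
flags = overload_flags(2.0, 5.0, samples)  # one 0/1 flag per quantile
```

An RL agent receiving these three flags alongside the usual sequencing state can distinguish a model that overloads the station only in pessimistic realizations (flag set at the 75% quantile only) from one that overloads it even in optimistic ones.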


Similar resources

A Multi-Objective Mixed-Model Assembly Line Sequencing Problem With Stochastic Operation Time

In today's competitive market, producers who can quickly adapt to the diverse demands of customers are successful. Therefore, in order to satisfy these market demands, the mixed-model assembly line (MMAL) has seen increasing use in industry. A mixed-model assembly line (MMAL) is a type of production line in which varieties of products with common base characteristics are assembled o...

Full text

Multiple Model-Based Reinforcement Learning

We propose a modular reinforcement learning architecture for nonlinear, nonstationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic idea is to decompose a complex task into multiple domains in space and time based on the predictability of the environmental dynamics. The system is composed of multiple modules, each of which consists of a state predict...

Full text

Reinforcement Learning using Kernel-Based Stochastic Factorization

Kernel-based reinforcement-learning (KBRL) is a method for learning a decision policy from a set of sample transitions which stands out for its strong theoretical guarantees. However, the size of the approximator grows with the number of transitions, which makes the approach impractical for large problems. In this paper we introduce a novel algorithm to improve the scalability of KBRL. We resor...

Full text

Reinforcement Learning with Multiple Demonstrations

Many tasks in robotics can be described as a trajectory that the robot should follow. Unfortunately, specifying the desired trajectory is often a non-trivial task. For example, when asked to describe the trajectory that a helicopter should follow to perform an aerobatic flip, one would have to not only (a) specify a complete trajectory in state space that intuitively corresponds to the aerobati...

Full text

Reinforcement Learning by Probability Matching

We present a new algorithm for associative reinforcement learning. The algorithm is based upon the idea of matching a network's output probability with a probability distribution derived from the environment's reward signal. This Probability Matching algorithm is shown to perform faster and be less susceptible to local minima than previously existing algorithms. We use Probability Matching to t...

Full text


Journal

Journal title: OR Spectrum

Year: 2021

ISSN: 0171-6468, 1436-6304

DOI: https://doi.org/10.1007/s00291-021-00652-x